Report: GWAS analysis

1 Project Summary

Parameter         Value
Project           test-gwas
Pipeline Version  v0.2
Date              2021-08-09
Phenotype File    test/regenie_pheno_input.pheno.validated.txt
Phenotype         p21001_i0
Covariates        COV1
Regenie Output    p21001_i0.rarevars.regenie.gz

2 Phenotype Statistics

2.1 Overview

╭────────────────────────────── skimpy summary ──────────────────────────────╮
│          Data Summary                Data Types                            │
│ ┏━━━━━━━━━━━━━━━━━━━┳━━━━━━━━┓ ┏━━━━━━━━━━━━━┳━━━━━━━┓                     │
│ ┃ dataframe         ┃ Values ┃ ┃ Column Type ┃ Count ┃                     │
│ ┡━━━━━━━━━━━━━━━━━━━╇━━━━━━━━┩ ┡━━━━━━━━━━━━━╇━━━━━━━┩                     │
│ │ Number of rows    │ 63930  │ │ float64     │ 1     │                     │
│ │ Number of columns │ 1      │ └─────────────┴───────┘                     │
│ └───────────────────┴────────┘                                             │
│                                   number                                   │
│ ┏━━━━━━━━━━━━━┳━━━━━━┳━━━━━━┳━━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━┳━━━━━━┳━━━━━━┓ │
│ ┃ column_name ┃ NA   ┃ NA % ┃ mean ┃ sd  ┃ p0  ┃ p25 ┃ p75 ┃ p100 ┃ hist ┃ │
│ ┡━━━━━━━━━━━━━╇━━━━━━╇━━━━━━╇━━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━╇━━━━━━╇━━━━━━┩ │
│ │ p21001_i0   │    0 │    0 │   27 │ 5.2 │  13 │  24 │  30 │   68 │ ▂█▂  │ │
│ └─────────────┴──────┴──────┴──────┴─────┴─────┴─────┴─────┴──────┴──────┘ │
╰─────────────────────────────────── End ────────────────────────────────────╯

2.2 Phenotype distribution

3 Main results

The following plots summarise results across all tests and AF bins by plotting only the most significant result for each gene. In case of ties, all results with the same P value are plotted.
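
The per-gene selection described above can be sketched in a few lines of Python. This is only an illustration, not the pipeline's own code: the field names (gene, test, mask, afbin, log10p) and the example rows are hypothetical.

```python
def top_hits_per_gene(results):
    """Keep, for each gene, every row whose LOG10P equals that gene's maximum."""
    best = {}
    for row in results:
        gene = row["gene"]
        if gene not in best or row["log10p"] > best[gene]:
            best[gene] = row["log10p"]
    # Ties are kept: every row matching the per-gene maximum is returned.
    return [row for row in results if row["log10p"] == best[row["gene"]]]


# Hypothetical rows; GENE_A has two tied top results.
results = [
    {"gene": "GENE_A", "test": "ADD", "mask": "LoF", "afbin": "0.01", "log10p": 6.2},
    {"gene": "GENE_A", "test": "ADD-SKAT", "mask": "LoF", "afbin": "0.01", "log10p": 6.2},
    {"gene": "GENE_A", "test": "ADD", "mask": "CDS", "afbin": "0.05", "log10p": 1.1},
    {"gene": "GENE_B", "test": "ADD", "mask": "CDS", "afbin": "0.01", "log10p": 0.4},
]
top = top_hits_per_gene(results)
```

Here GENE_A contributes two points to the plot (the tied results) and GENE_B contributes one.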

3.1 Manhattan plot - overall results

The first plot represents results using raw LOG10P values.

  • the dashed red line represents the Bonferroni threshold when correcting across all results.
  • the dashed orange line represents the highest (most stringent) Bonferroni threshold when correcting for each test group separately (namely each combination of test * mask * afbin).
  • the dashed blue line represents the P value at which the global FDR (FDR computed across all results) is <= 0.05. This line is missing when there are no results with global FDR <= 0.05.
  • the green line represents your configured threshold (5.0). Genes above this line are annotated.
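
As a rough guide to where these lines fall on the LOG10P axis, the sketch below computes them under stated assumptions: a significance level of 0.05 for the Bonferroni lines and the Benjamini-Hochberg procedure for the FDR line. The report does not state either choice, and the group sizes and P values are made up for illustration.

```python
import math

ALPHA = 0.05  # assumed significance level; not stated in the report


def bonferroni_log10p(n_tests, alpha=ALPHA):
    """-log10 of the Bonferroni-corrected P value threshold for n_tests tests."""
    return -math.log10(alpha / n_tests)


def bh_log10p_cutoff(pvalues, fdr=0.05):
    """-log10 of the largest P value passing Benjamini-Hochberg at the given FDR.

    Returns None when no result passes (the line would then be absent).
    Assumes all P values are strictly positive.
    """
    ranked = sorted(pvalues)
    m = len(ranked)
    passing = [p for i, p in enumerate(ranked, start=1) if p <= fdr * i / m]
    return -math.log10(passing[-1]) if passing else None


# Hypothetical number of tested genes per (test, mask, afbin) group.
group_sizes = {("ADD", "LoF", "0.01"): 1000, ("ADD", "CDS", "0.05"): 10000}
global_line = bonferroni_log10p(sum(group_sizes.values()))
group_line = max(bonferroni_log10p(n) for n in group_sizes.values())
fdr_line = bh_log10p_cutoff([0.001, 0.02, 0.03, 0.8])
```

Under these assumptions the global Bonferroni line always sits at or above the per-group line, because the total number of tests is at least as large as the largest group.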

3.2 Manhattan plot - configured statistics

In the second Manhattan plot, the Y axis represents corrected P values based on the configured statistic: LOG10P_FDR_bygroup.

Here, the red line represents the configured threshold (1.3), and genes above this line are annotated. Please note that many points may not be visible: tests with a low raw LOG10P often yield a near-zero LOG10(corrected P), so many points may collapse onto the zero baseline.
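
The collapse towards zero can be reproduced with a small example. The sketch assumes that the corrected values are Benjamini-Hochberg adjusted P values computed within one test group, which is an interpretation of the name LOG10P_FDR_bygroup rather than something the report states. The raw values are invented. Note that a threshold of 1.3 corresponds to an adjusted P value of about 0.05, since -log10(0.05) is roughly 1.30.

```python
import math


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw LOG10P values for one test group.
raw_log10p = [6.0, 3.0, 0.5, 0.3, 0.1]
pvalues = [10 ** -x for x in raw_log10p]
corrected_log10p = [-math.log10(q) for q in bh_adjust(pvalues)]
```

In this example only the two strongest results stay above 1.3 after correction, while the three weakest all end up below 0.3, which is why such points pile up near the baseline.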

4 Detailed results

The following section contains detailed results with Manhattan plots and QQ plots for each test group, namely each specific combination of test, mask and AF bin. In these plots the dashed red line represents the Bonferroni threshold for the specific test group.
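
For readers unfamiliar with QQ plots, the sketch below shows one common way to pair observed and expected values under the null hypothesis of uniformly distributed P values. The plotting position (i - 0.5) / n is an assumption chosen for illustration; the plotting package used by the pipeline may use a different convention.

```python
import math


def qq_points(pvalues):
    """Pair expected and observed -log10(P) values, largest first."""
    n = len(pvalues)
    observed = sorted((-math.log10(p) for p in pvalues), reverse=True)
    # Expected quantiles of a uniform null, using the (i - 0.5) / n convention.
    expected = [-math.log10((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, observed))


# Hypothetical P values for one test group.
points = qq_points([0.5, 0.05, 0.005, 0.9])
```

Points lying above the diagonal (observed greater than expected) indicate results more significant than the null would predict.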

4.1 Burden test

Results for the standard burden test are shown separately for each mask and AF bin.

4.1.1 Mask CDS - AF bin 0.01

4.1.2 Mask CDS - AF bin 0.05

4.1.3 Mask CDS - AF bin singleton

4.1.4 Mask ProtChanging - AF bin 0.01

4.1.5 Mask ProtChanging - AF bin 0.05

4.1.6 Mask ProtChanging - AF bin singleton

4.1.7 Mask LoF - AF bin 0.01

4.1.8 Mask LoF - AF bin 0.05

4.1.9 Mask LoF - AF bin singleton

4.1.10 Mask HC_LoF - AF bin 0.01

4.1.11 Mask HC_LoF - AF bin 0.05

4.1.12 Mask HC_LoF - AF bin singleton

4.2 Other rare variants tests

Results for other gene tests (SKAT, ACAT, …) are shown separately for each test and mask.

4.2.1 Test ADD-SKAT - Mask CDS

4.2.2 Test ADD-SKAT - Mask ProtChanging

4.2.3 Test ADD-SKAT - Mask LoF

4.2.4 Test ADD-SKAT - Mask HC_LoF

4.2.5 Test ADD-SKATO - Mask CDS

4.2.6 Test ADD-SKATO - Mask ProtChanging

4.2.7 Test ADD-SKATO - Mask LoF

4.2.8 Test ADD-SKATO - Mask HC_LoF


This report was created with nf-fast-regenie v0.2, a Nextflow pipeline developed by Edoardo Giacopuzzi at the Human Technopole Foundation, Milan, Italy. Plots are generated using the gwaslab package.